Compressing Recurrent Neural Network with Tensor Train
Recurrent Neural Networks (RNNs) are a popular choice for modeling temporal and
sequential tasks and achieve state-of-the-art performance on many complex
problems. However, most state-of-the-art RNNs have millions of parameters and
require substantial computational resources for training and for predicting on
new data. This paper proposes an alternative RNN model that significantly
reduces the number of parameters by representing the weight parameters in the
Tensor Train (TT) format. We implement the TT-format representation for several
RNN architectures, such as the simple RNN and the Gated Recurrent Unit (GRU),
and compare and evaluate our proposed RNN models against their uncompressed
counterparts on sequence classification and sequence prediction tasks. Our
proposed TT-format RNNs preserve performance while reducing the number of RNN
parameters by up to 40 times.
Comment: Accepted at IJCNN 2017
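As a rough illustration of the idea, the numpy sketch below builds a weight matrix from Tensor Train cores and compares parameter counts. The mode sizes (4, 8, 4) and TT-ranks (1, 3, 3, 1) are hypothetical choices for illustration, not the paper's configuration; a practical implementation would also contract the input with the cores directly instead of reconstructing the full matrix.

    import numpy as np

    # Hypothetical setup: a 128 x 128 weight whose row and column dimensions
    # are factored as 4*8*4, with assumed TT-ranks (1, 3, 3, 1).
    m_modes, n_modes = (4, 8, 4), (4, 8, 4)
    ranks = (1, 3, 3, 1)
    rng = np.random.default_rng(0)
    # Core k has shape (r_k, m_k, n_k, r_{k+1}).
    cores = [rng.standard_normal((ranks[k], m_modes[k], n_modes[k],
                                  ranks[k + 1])) * 0.1
             for k in range(len(m_modes))]

    def tt_to_full(cores):
        """Contract the TT cores back into the full weight matrix."""
        W = cores[0][0]                          # shape (m_0, n_0, r_1)
        for core in cores[1:]:
            M, N, _ = W.shape
            _, m, n, r2 = core.shape
            # Merge the shared TT-rank index; keep row and column modes grouped.
            W = np.einsum('abr,rmns->ambns', W, core).reshape(M * m, N * n, r2)
        return W[:, :, 0]

    W = tt_to_full(cores)                        # full 128 x 128 matrix
    x = rng.standard_normal(W.shape[1])
    y = W @ x                                    # the layer's linear map y = W x

    full_params = W.size                         # 16384
    tt_params = sum(c.size for c in cores)       # 672, roughly 24x fewer
    print(full_params, tt_params, full_params / tt_params)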
Speech-to-speech Translation between Untranscribed Unknown Languages
In this paper, we explore a method for training speech-to-speech translation
models without any transcription or linguistic supervision. Our proposed method
consists of two steps: First, we train a discrete quantized autoencoder to
perform unsupervised term discovery and generate a discrete representation of
the speech. Second, we train a sequence-to-sequence model that directly maps
the source-language speech to the target language's discrete representation.
Our proposed method can directly generate target speech without any auxiliary
steps or pre-training on a source or target transcription. To the best of our
knowledge, this is the first work to perform pure speech-to-speech translation
between untranscribed unknown languages.
Comment: Accepted at IEEE ASRU 2019. Web page for more samples & details:
https://sp2code-translation-v1.netlify.com
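The sketch below illustrates only the discretization idea behind the first step: each speech frame is assigned to its nearest codebook entry, and consecutive repeats are collapsed into unit tokens. The codebook size, feature dimension, and collapsing rule are assumptions for illustration; the paper's actual first step trains a discrete quantized autoencoder rather than using a fixed random codebook.

    import numpy as np

    rng = np.random.default_rng(0)
    codebook = rng.standard_normal((64, 80))   # assumed: 64 units, 80-dim features
    frames = rng.standard_normal((200, 80))    # stand-in for 200 log-mel frames

    # Nearest-neighbour quantization: argmin_k ||frame - codebook[k]||^2.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    unit_ids = dists.argmin(axis=1)            # (200,) discrete symbol sequence

    # Collapse consecutive duplicates so each unit behaves like a pseudo-token;
    # a symbol sequence like this is what the seq2seq model would predict.
    collapsed = unit_ids[np.r_[True, unit_ids[1:] != unit_ids[:-1]]]
    print(collapsed[:20])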
Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model
A sequence-to-sequence model is a neural network module for mapping between two
sequences of different lengths. The sequence-to-sequence model has three core
modules: encoder, decoder, and attention. Attention is the bridge that connects
the encoder and decoder modules and improves model performance in many tasks.
In this paper, we propose two ideas to improve sequence-to-sequence model
performance by enhancing the attention module. First, we maintain a history of
the alignment location and the expected context from several previous time
steps. Second, we apply a multiscale convolution over several previous
attention vectors and feed the result to the current decoder state. We applied
our proposed framework to sequence-to-sequence speech recognition and
text-to-speech systems. The results reveal that our proposed extension
significantly improves performance compared to a standard attention baseline.
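A minimal numpy sketch of the multiscale idea follows: the previous attention vector is convolved with filters of several widths, and the resulting location features are added to a content-based score. The dimensions, the additive scoring form, and the use of a single previous attention vector are assumptions for illustration, not the paper's exact layer.

    import numpy as np

    rng = np.random.default_rng(0)
    T, H = 50, 128                              # encoder length, hidden size
    enc = rng.standard_normal((T, H))           # encoder states
    dec = rng.standard_normal(H)                # current decoder state
    prev_attn = np.full(T, 1.0 / T)             # previous attention weights

    def conv1d_same(signal, kernel):
        """'same'-padded 1-D convolution (cross-correlation, as in deep
        learning conventions) over the time axis."""
        pad = len(kernel) // 2
        padded = np.pad(signal, pad)
        return np.array([padded[t:t + len(kernel)] @ kernel
                         for t in range(len(signal))])

    # Multi-scale location features: one filter per assumed kernel width.
    widths = (3, 7, 15)
    filters = [rng.standard_normal(w) * 0.1 for w in widths]
    loc_feats = sum(conv1d_same(prev_attn, f) for f in widths and filters)

    # Additive score: content term plus the convolved location term.
    scores = enc @ dec / np.sqrt(H) + loc_feats
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                          # new attention over encoder steps
    print(attn[:5])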
Gated Recurrent Neural Tensor Network
Recurrent Neural Networks (RNNs) are a powerful scheme for modeling temporal
and sequential data; they need to capture long-term dependencies in a dataset
and to represent inputs in their hidden layers with a model expressive enough
to capture rich information. For modeling long-term dependencies, the gating
mechanism helps RNNs remember and forget previous information. Representing
the hidden layers of an RNN with more expressive operations (i.e., tensor
products) helps it learn a more complex relationship between the current input
and the previous hidden-layer information. Together, these ideas can improve
RNN performance. In this paper, we propose a novel RNN architecture that
combines the gating mechanism and the tensor product in a single model. By
combining these two concepts, our proposed models learn long-term dependencies
through gating units and obtain a more expressive and direct interaction
between the input and hidden layers through a tensor product with
3-dimensional array (tensor) weight parameters. We take the Long Short-Term
Memory (LSTM) RNN and the Gated Recurrent Unit (GRU) RNN and incorporate the
tensor product into their formulations; we call the resulting models the Long
Short-Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and the Gated
Recurrent Unit Recurrent Neural Tensor Network (GRURNTN). We conducted
experiments on word-level and character-level language modeling tasks, which
revealed that our proposed models significantly improve performance compared
to our baseline models.
Comment: Accepted at IJCNN 2016. URL: http://ieeexplore.ieee.org/document/7727233
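The numpy sketch below shows one way a bilinear tensor-product term can enter a GRU-style candidate activation, in the spirit of GRURNTN. The dimensions, initialization, and exact placement of the tensor term inside the candidate are assumptions for illustration, not the paper's implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    D, H = 16, 8                                   # assumed input and hidden dims

    def mat(*shape):
        return rng.standard_normal(shape) * 0.1

    Wz, Uz = mat(D, H), mat(H, H)                  # update gate weights
    Wr, Ur = mat(D, H), mat(H, H)                  # reset gate weights
    Wc, Uc = mat(D, H), mat(H, H)                  # candidate (standard part)
    T = mat(H, D, H)                               # 3-D tensor weight

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def grurntn_step(x, h):
        z = sigmoid(x @ Wz + h @ Uz)               # update gate
        r = sigmoid(x @ Wr + h @ Ur)               # reset gate
        # Bilinear tensor term: for each hidden unit k, x^T T[k] (r*h).
        bilinear = np.einsum('d,kdh,h->k', x, T, r * h)
        c = np.tanh(x @ Wc + (r * h) @ Uc + bilinear)
        return (1 - z) * h + z * c                 # blend old state and candidate

    h = np.zeros(H)
    for x in rng.standard_normal((5, D)):          # run over 5 timesteps
        h = grurntn_step(x, h)
    print(h)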